Determination of the optimal number of features for quadratic discriminant analysis via the normal approximation to the discriminant distribution
Authors
Abstract
Given the joint feature-label distribution, increasing the number of features always results in decreased classification error; however, this is not the case when a classifier is designed via a classification rule from sample data. Typically, for a fixed sample size, the error of a designed classifier decreases and then increases as the number of features grows. The problem is especially acute when sample sizes are very small and the potential number of features is very large. To obtain a general understanding of the kinds of feature-set sizes that provide good performance for a particular classification rule, performance must be evaluated based on accurate error estimation, and hence a model-based setting for optimizing the number of features is needed. This paper treats quadratic discriminant analysis (QDA) in the case of unequal covariance matrices. For two normal class-conditional distributions, the QDA classifier is determined by a discriminant. The standard plug-in rule estimates the discriminant from a feature-label sample by replacing the means and covariance matrices with their respective sample means and sample covariance matrices. The unbiasedness of these estimators assures good estimation for large samples, but not for small samples. Our goal is to find an essentially analytic method to produce an error curve as a function of the number of features, so that the curve can be minimized to determine an optimal number of features. We use a normal approximation to the distribution of the estimated discriminant. Since the mean and variance of the estimated discriminant are computed exactly, they provide insight into how the covariance matrices affect the optimal number of features. We derive the mean and variance of the estimated discriminant and compare feature-size optimization using the normal approximation with optimization obtained by simulating the true distribution of the estimated discriminant. Optimization via the normal approximation provides huge computational savings in comparison to optimization via simulation of the true distribution. Feature-size optimization via the normal approximation is very accurate when the covariance matrices differ modestly. The optimal number of features based on the normal approximation will exceed the actual optimal number when there is large disagreement between the covariance matrices; however, this difference is not important, because the true misclassification errors at the feature-set size obtained from the normal approximation and at the size obtained from the true distribution differ only slightly, even for significantly different covariance matrices. © 2004 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
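The abstract contrasts the analytic normal-approximation route with feature-size optimization obtained by simulating the true distribution of the estimated discriminant. The Python sketch below illustrates only that simulation baseline and the peaking phenomenon described above: it designs plug-in QDA classifiers from small samples, estimates the true error of each designed classifier by Monte Carlo, and reads off the minimizing feature-set size. The model parameters (mu0, mu1, Sig0, Sig1), the per-class sample size n, and the trial counts are illustrative assumptions, not values from the paper, and the paper's exact mean and variance formulas for the estimated discriminant are not reproduced here.

# Monte Carlo sketch of feature-size optimization for plug-in QDA (assumed
# model parameters; not the paper's analytic normal-approximation method).
import numpy as np

rng = np.random.default_rng(0)

def qda_discriminant(X, mean0, cov0, mean1, cov1):
    # Plug-in QDA discriminant W(x) for each row of X; assign class 1 when W(x) > 0.
    D0 = X - mean0
    D1 = X - mean1
    q0 = np.einsum('ij,ij->i', D0, np.linalg.solve(cov0, D0.T).T)
    q1 = np.einsum('ij,ij->i', D1, np.linalg.solve(cov1, D1.T).T)
    _, ld0 = np.linalg.slogdet(cov0)
    _, ld1 = np.linalg.slogdet(cov1)
    return 0.5 * (q0 - q1) + 0.5 * (ld0 - ld1)

def true_error(m0, c0, m1, c1, m0h, c0h, m1h, c1h, n_test=5000):
    # Monte Carlo estimate of the designed classifier's true misclassification
    # error under the true class-conditional normals (equal priors).
    err = 0.0
    for y, (m, c) in enumerate([(m0, c0), (m1, c1)]):
        X = rng.multivariate_normal(m, c, size=n_test)
        pred = qda_discriminant(X, m0h, c0h, m1h, c1h) > 0
        err += np.mean(pred != y)
    return err / 2.0

D, n = 12, 15                                    # assumed max feature count and per-class sample size
mu0, mu1 = np.zeros(D), 0.6 * np.ones(D)         # assumed class means
Sig0, Sig1 = np.eye(D), 1.5 * np.eye(D)          # modestly different covariance matrices (assumed)

feature_sizes = range(2, D + 1)
errors = []
for d in feature_sizes:                          # grow the feature set one feature at a time
    trial_errors = []
    for _ in range(50):                          # average over independent training samples
        X0 = rng.multivariate_normal(mu0[:d], Sig0[:d, :d], size=n)
        X1 = rng.multivariate_normal(mu1[:d], Sig1[:d, :d], size=n)
        m0h, c0h = X0.mean(axis=0), np.cov(X0, rowvar=False)
        m1h, c1h = X1.mean(axis=0), np.cov(X1, rowvar=False)
        trial_errors.append(true_error(mu0[:d], Sig0[:d, :d], mu1[:d], Sig1[:d, :d],
                                       m0h, c0h, m1h, c1h))
    errors.append(np.mean(trial_errors))

best = feature_sizes[int(np.argmin(errors))]
print("expected true error by feature-set size:", dict(zip(feature_sizes, np.round(errors, 3))))
print("optimal number of features for this model and sample size:", best)

With the assumed parameters the expected-error curve typically decreases and then turns upward as features are added, and its minimizer plays the role of the optimal feature-set size; the paper's contribution is to obtain such a curve analytically from the exact mean and variance of the estimated discriminant rather than by this kind of simulation.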
Similar papers
Feature reduction of hyperspectral images: Discriminant analysis and the first principal component
When the number of training samples is limited, feature reduction plays an important role in classification of hyperspectral images. In this paper, we propose a supervised feature extraction method based on discriminant analysis (DA) which uses the first principal component (PC1) to weight the scatter matrices. The proposed method, called DA-PC1, copes with the small sample size problem and has...
Intrusion Detection in IOT based Networks Using Double Discriminant Analysis
Intrusion detection is one of the main challenges in wireless systems, especially in Internet of Things (IoT)-based networks. There are various attack types, such as probe, denial of service, remote-to-local, and user-to-root. In addition to known attacks and malicious behaviors, there are various unknown attacks, some of which behave similarly to each other or mimic the norma...
A prediction distribution of atmospheric pollutants using support vector machines, discriminant analysis and mapping tools (Case study: Tunisia)
Monitoring and controlling air quality parameters form an important subject of atmospheric and environmental research today due to the health impacts caused by the different pollutants present in the urban areas. The support vector machine (SVM), as a supervised learning analysis method, is considered an effective statistical tool for the prediction and analysis of air quality. The work present...
Discrimination of Golab apple storage time using acoustic impulse response and LDA and QDA discriminant analysis techniques
Firmness is one of the most important quality indicators for apple fruits and is highly correlated with storage time. The acoustic impulse response technique is one of the most commonly used nondestructive detection methods for evaluating apple firmness. This paper presents a non-destructive method for classification of Iranian apple (Malus domestica Borkh. cv. Golab) according...
Journal:
Pattern Recognition
Volume 38, Issue -
Pages -
Publication date: 2005